MAINT Add run_initializers_async, Entra auth, and config-file support#1418
MAINT Add run_initializers_async, Entra auth, and config-file support#1418romanlutz wants to merge 3 commits intoAzure:mainfrom
Conversation
- Add run_initializers_async to pyrit.setup for programmatic initialization - Switch AIRTInitializer to Entra (Azure AD) auth, removing API key requirements - Add --config-file flag to pyrit_backend CLI - Use PyRIT configuration loader in FrontendCore and pyrit_backend - Update AIRTTargetInitializer with new target types Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR extends PyRIT’s initialization flow to support a lighter-weight “run initializers only” step, adds backend CLI support for YAML config files, and updates AIRT initialization to use Entra ID-based auth (with additional target configuration support).
Changes:
- Add
run_initializers_asynctopyrit.setupand refactor initialization to reuse it. - Add
--config-filesupport topyrit_backendand migrate CLI core flows to useConfigurationLoader. - Update AIRT initializer/targets (Entra auth in
AIRTInitializer, extra kwargs support + GPT-5 high-reasoning target inAIRTTargetInitializer) and adjust tests accordingly.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
pyrit/setup/initialization.py |
Adds run_initializers_async and makes initialize_pyrit_async delegate initializer execution to it. |
pyrit/setup/__init__.py |
Re-exports run_initializers_async from the pyrit.setup public API. |
pyrit/cli/frontend_core.py |
Switches to configuration-driven initialization; separates “initialize registries/memory” from “run initializers”. |
pyrit/cli/pyrit_backend.py |
Adds --config-file and uses FrontendCore for initialization + initializer execution before starting Uvicorn. |
pyrit/setup/initializers/airt.py |
Migrates AIRT initialization to Entra auth token providers (DefaultAzureCredential-based flow). |
pyrit/setup/initializers/airt_targets.py |
Adds extra_kwargs support for target construction and introduces a GPT-5 high-reasoning Responses target configuration. |
tests/unit/cli/test_pyrit_backend.py |
Adds unit tests for pyrit_backend argument parsing and config-file forwarding to FrontendCore. |
tests/unit/cli/test_frontend_core.py |
Updates unit tests for the new frontend initialization and initializer-running behavior. |
tests/unit/setup/test_airt_initializer.py |
Updates tests for Entra auth (removes API key requirements from required env var assertions). |
tests/unit/setup/test_airt_targets_initializer.py |
Adds coverage for the new GPT-5 high-reasoning target extra body parameters. |
| from dataclasses import dataclass | ||
| from typing import Any, Optional | ||
| from dataclasses import dataclass, field | ||
| from typing import Any, Dict, List, Optional, Type |
There was a problem hiding this comment.
List and Type are imported but unused, which will fail linting (ruff F401) in this repository. Please remove unused imports or use them (e.g., if you intended to type TARGET_CONFIGS/target_class).
| from typing import Any, Dict, List, Optional, Type | |
| from typing import Any, Dict, Optional |
| endpoint_var="OPENAI_VIDEO_ENDPOINT", | ||
| key_var="OPENAI_VIDEO_KEY", | ||
| model_var="OPENAI_VIDEO_MODEL", | ||
| ), |
There was a problem hiding this comment.
The video target config was renamed and switched from AZURE_OPENAI_VIDEO_* env vars to OPENAI_VIDEO_*. This breaks existing configs and is inconsistent with the rest of the repo (integration tests and docs still reference AZURE_OPENAI_VIDEO_ENDPOINT/KEY/MODEL). Consider keeping the AZURE_OPENAI_VIDEO_* variables (and azure_openai_video registry name) or supporting both sets for backward compatibility; also update the docstring section that still mentions AZURE_OPENAI_VIDEO_* accordingly.
| ), | |
| ), | |
| TargetConfig( | |
| registry_name="azure_openai_video", | |
| target_class=OpenAIVideoTarget, | |
| endpoint_var="AZURE_OPENAI_VIDEO_ENDPOINT", | |
| key_var="AZURE_OPENAI_VIDEO_KEY", | |
| model_var="AZURE_OPENAI_VIDEO_MODEL", | |
| ), |
There was a problem hiding this comment.
Agree with copilot with this one,
esp because of this docstring at the top:
Note: This module only includes PRIMARY endpoint configurations from .env_example. Alias configurations (those using ${...} syntax) are excluded since they reference other primary configurations.
| async def run_initializers_async( | ||
| *, | ||
| initializers: Optional[Sequence["PyRITInitializer"]] = None, | ||
| initialization_scripts: Optional[Sequence[Union[str, pathlib.Path]]] = None, | ||
| ) -> None: | ||
| """ | ||
| Run initializers and initialization scripts without re-initializing memory or environment. | ||
|
|
||
| This is useful when memory and environment are already set up (e.g. via | ||
| :func:`initialize_pyrit_async`) and only the initializer step needs to run. | ||
|
|
||
| Args: | ||
| initializers: Optional sequence of PyRITInitializer instances to execute directly. | ||
| initialization_scripts: Optional sequence of Python script paths containing | ||
| PyRITInitializer classes. | ||
|
|
||
| Raises: | ||
| ValueError: If initializers are invalid or scripts cannot be loaded. | ||
| """ | ||
| all_initializers = list(initializers) if initializers else [] | ||
|
|
||
| # Load additional initializers from scripts | ||
| if initialization_scripts: | ||
| script_initializers = _load_initializers_from_scripts(script_paths=initialization_scripts) | ||
| all_initializers.extend(script_initializers) | ||
|
|
||
| # Execute all initializers (sorted by execution_order) | ||
| if all_initializers: | ||
| await _execute_initializers_async(initializers=all_initializers) |
There was a problem hiding this comment.
run_initializers_async is a new public API surface but there are no unit tests covering its behavior (e.g., that it loads initializers from scripts, executes in execution_order, and does not reset defaults / reinitialize memory). Since initialize_pyrit_async already has orchestration tests, consider adding focused tests for this new function as well.
| PyRITInitializer classes. | ||
|
|
||
| Raises: | ||
| ValueError: If initializers are invalid or scripts cannot be loaded. |
There was a problem hiding this comment.
The run_initializers_async docstring says it raises ValueError when scripts cannot be loaded, but _load_initializers_from_scripts() can also raise FileNotFoundError (and potentially other exceptions during import). Please update the Raises: section to reflect the actual exception types callers should handle.
| ValueError: If initializers are invalid or scripts cannot be loaded. | |
| ValueError: If one or more provided initializers are invalid. | |
| FileNotFoundError: If an initialization script path does not exist. | |
| Exception: If an error occurs while importing initialization scripts or executing initializers. |
|
|
||
| assert context._database == frontend_core.SQLITE | ||
| assert context._initialization_scripts is None | ||
| assert context._initializer_names is None | ||
| assert context._initializer_names == ["airt", "airt_targets"] | ||
| assert context._log_level == logging.WARNING |
There was a problem hiding this comment.
test_init_with_defaults asserts FrontendCore()._initializer_names == ["airt", "airt_targets"], but ConfigurationLoader.load_with_overrides() defaults to an empty initializer list when ~/.pyrit/.pyrit_conf does not exist. That means _initializer_names will typically be None, making this test dependent on a developer machine’s home directory state and likely to fail on CI. Consider patching pyrit.setup.configuration_loader.DEFAULT_CONFIG_PATH.exists() to False (or passing an explicit temporary config_file) and asserting the deterministic default instead.
| registry_name="azure_openai_gpt5_responses_high_reasoning", | ||
| target_class=OpenAIResponseTarget, | ||
| endpoint_var="AZURE_OPENAI_GPT5_RESPONSES_ENDPOINT", | ||
| key_var="AZURE_OPENAI_GPT5_KEY", |
There was a problem hiding this comment.
this key var will be removed right ?
There was a problem hiding this comment.
we should keep the key here imo, if it exists in the .env we should load it. For our team, if we don't have a key, we should remove it from .env
| initializer_instances = [self.initializer_registry.get_class(name)() for name in self._initializer_names] | ||
|
|
||
| await run_initializers_async( | ||
| initializers=initializer_instances, |
There was a problem hiding this comment.
probably too big for here, but one thing on my mind is it'd be nice to pass constructor arguments to initializers
There was a problem hiding this comment.
Also probably execution order; rn it's built into the class but maybe it should actually be the order we call it or something.
| await run_initializers_async(initializers=initializers, initialization_scripts=initialization_scripts) | ||
|
|
||
|
|
||
| async def run_initializers_async( |
There was a problem hiding this comment.
I'm a little worried about this as is. I think there are three things we need to make sure we've thought through:
- Idempotency: There are initializers that can re-register or duplicate things if you re-run
reset_default_values: Right now this is called on setup, but if we run initializers directly this is not wiped clean. E.g. if you runSimpleInitializerand then laterAIRTInitializerthey can both set the sameconverter_targeton overlapping scopes. Without a reset, you get states that are tricky to debug.- Precondition enforcement: We'd need to verify
initialize_pyritis called first, env files are loaded, etc.
Because these are all lumped together, I like keeping it bundled. There is some performance hit because we're nuking the state and resetting things, reloading env... but I don't think it's a huge deal. And in some ways it's nice (e.g. if we update the .env file). It's a bit awkard to re-call, but I think it makes it less error prone.
The biggest reason I think we'd want to separate is if we're doing additive initializer execution, but I don't think that's something we want near term?
4337fb7 to
9642543
Compare
| initializer_instances = None | ||
| if self._initializer_names: | ||
| print(f"Running {len(self._initializer_names)} initializer(s)...") | ||
| sys.stdout.flush() | ||
| initializer_instances = [self.initializer_registry.get_class(name)() for name in self._initializer_names] | ||
|
|
||
| await run_initializers_async( | ||
| initializers=initializer_instances, | ||
| initialization_scripts=self._initialization_scripts, | ||
| ) |
There was a problem hiding this comment.
FrontendCore.run_initializers_async() instantiates initializers as self.initializer_registry.get_class(name)() (no constructor args). Since ConfigurationLoader supports initializer configs with optional args (InitializerConfig.args), config-file initialization currently can’t pass initializer parameters through FrontendCore. Consider preserving initializer configs (name + args) from ConfigurationLoader and instantiating with initializer_class(**args) when provided (or delegating to a ConfigurationLoader resolver) so --config-file can fully express initializer configuration.
| # Ensure context is initialized first (loads registries) | ||
| # This must happen BEFORE we run initializers to avoid double-initialization | ||
| if not context._initialized: | ||
| await context.initialize_async() | ||
|
|
||
| # Run initializers before scenario | ||
| initializer_instances = None | ||
| if context._initializer_names: | ||
| print(f"Running {len(context._initializer_names)} initializer(s)...") | ||
| sys.stdout.flush() | ||
|
|
||
| initializer_instances = [] | ||
|
|
||
| for name in context._initializer_names: | ||
| initializer_class = context.initializer_registry.get_class(name) | ||
| initializer_instances.append(initializer_class()) | ||
|
|
||
| # Re-initialize PyRIT with the scenario-specific initializers | ||
| # This resets memory and applies initializer defaults | ||
| await initialize_pyrit_async( | ||
| memory_db_type=context._database, | ||
| initialization_scripts=context._initialization_scripts, | ||
| initializers=initializer_instances, | ||
| env_files=context._env_files, | ||
| ) | ||
| # Resolve and run initializers + initialization scripts | ||
| await context.run_initializers_async() | ||
|
|
There was a problem hiding this comment.
run_scenario_async previously called initialize_pyrit_async(...) before each scenario run, which reloads env files and (critically) calls reset_default_values() to ensure initializer-applied defaults don’t leak between runs. The new flow (await context.run_initializers_async()) skips that reset/re-init step, so defaults and registry state set by initializers can accumulate across scenarios in the same process (and repeated runs may not be idempotent). Consider either keeping the per-run initialize_pyrit_async(...) call, or adding an explicit reset/reload option (e.g., call reset_default_values() and reload env) when running initializers for a scenario.
| TargetConfig( | ||
| registry_name="azure_openai_video", | ||
| registry_name="openai_video", | ||
| target_class=OpenAIVideoTarget, | ||
| endpoint_var="AZURE_OPENAI_VIDEO_ENDPOINT", | ||
| key_var="AZURE_OPENAI_VIDEO_KEY", | ||
| model_var="AZURE_OPENAI_VIDEO_MODEL", | ||
| underlying_model_var="AZURE_OPENAI_VIDEO_UNDERLYING_MODEL", | ||
| endpoint_var="OPENAI_VIDEO_ENDPOINT", | ||
| key_var="OPENAI_VIDEO_KEY", | ||
| model_var="OPENAI_VIDEO_MODEL", | ||
| ), |
There was a problem hiding this comment.
The video target config is now registered as openai_video and reads OPENAI_VIDEO_* env vars, but the initializer’s own docstring still documents AZURE_OPENAI_VIDEO_* and multiple integration tests reference AZURE_OPENAI_VIDEO_ENDPOINT/KEY/MODEL. This is a breaking change and will likely cause existing env setups/tests to fail. Consider keeping the azure_openai_video registry name and AZURE_OPENAI_VIDEO_* variables, or supporting both AZURE_OPENAI_VIDEO_* and OPENAI_VIDEO_* for backward compatibility (and update the docstring accordingly).
| """ | ||
| Run initializers and initialization scripts without re-initializing memory or environment. | ||
|
|
||
| This is useful when memory and environment are already set up (e.g. via | ||
| :func:`initialize_pyrit_async`) and only the initializer step needs to run. | ||
|
|
There was a problem hiding this comment.
run_initializers_async is now exported as a public setup API and is used as a lighter-weight alternative to initialize_pyrit_async, but it does not call reset_default_values() (which initialize_pyrit_async does before executing initializers). That distinction is easy to miss and can lead to confusing state leakage when callers run initializers multiple times. Consider explicitly documenting this in the docstring (and/or offering a reset_defaults flag) so callers can choose between clean-state and additive execution.
The changes in the following are simply waiting for #1404 and will disappear from this PR: